Skip to content

Add maxfailures option to limit test failures before stopping#61560

Open
VanitasCodes wants to merge 4 commits intoJuliaLang:masterfrom
VanitasCodes:add-maxfailures-option
Open

Add maxfailures option to limit test failures before stopping#61560
VanitasCodes wants to merge 4 commits intoJuliaLang:masterfrom
VanitasCodes:add-maxfailures-option

Conversation

@VanitasCodes
Copy link
Copy Markdown

Fixes #21594
Fixes #23375

This came out of the triage discussion on #61483, where @oscardssmith suggested that instead of changing the default behavior of @testset, we should add a maxfailures option that test runners can use to control how many failures are tolerated before stopping execution. This PR implements the Test stdlib side of that.

There's a global atomic counter that tracks failures and errors across testsets, and a configurable limit. When the count hits the limit, a MaxFailuresError is thrown and execution stops. The default limit is typemax(Int), so existing behavior is completely unchanged unless you explicitly call set_max_failures.

Four new functions are exported from Test:

  • set_max_failures(n) - set the limit
  • get_max_failures() - read the current limit
  • get_failure_count() - read the current failure count
  • reset_failure_count() - reset the counter to zero

The MaxFailuresError integrates with the existing is_failfast_error machinery, so it propagates correctly through nested testsets the same way FailFastError does. If both failfast and maxfailures are set, failfast takes precedence since it's checked first in record.

The intended usage is for something like Pkg.test(; maxfailures=10) to call Test.set_max_failures(10) before running tests, which would be a follow-up PR.

Copilot AI review requested due to automatic review settings April 11, 2026 02:07
@VanitasCodes
Copy link
Copy Markdown
Author

@oscardssmith Is this along the lines of what you had in mind?

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a maxfailures mechanism to Julia’s Test stdlib to stop a test run after a configurable number of failures/errors, without changing default @testset behavior.

Changes:

  • Introduces global atomic failure counter + limit, plus exported setters/getters/reset helpers.
  • Adds MaxFailuresError and integrates it with existing failfast propagation/printing paths.
  • Adds stdlib tests validating default behavior, stopping at N failures, counting errors, and invalid input.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File Description
stdlib/Test/src/Test.jl Implements global max-failures tracking, exports API, adds MaxFailuresError, and wires it into record and top-level printing.
stdlib/Test/test/runtests.jl Adds integration-style tests that spawn a Julia process to verify max-failures behavior and messaging.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread stdlib/Test/src/Test.jl Outdated
Comment thread stdlib/Test/src/Test.jl Outdated
Comment thread stdlib/Test/src/Test.jl Outdated
Comment thread stdlib/Test/src/Test.jl Outdated
Comment thread stdlib/Test/src/Test.jl Outdated
result = read(pipeline(ignorestatus(cmd), stderr=devnull), String)
@test occursin("Max failures reached: 1", result)
@test occursin("First", result)
@test !occursin(r"Test Summary:.*\n.*Second", result)
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These regexes assume \n line endings. To make the tests more robust across platforms/environments, consider matching \r?\n (or otherwise avoiding hard-coding newline style) so the assertion doesn’t become Windows-sensitive.

Copilot uses AI. Check for mistakes.
result = read(pipeline(ignorestatus(cmd), stderr=devnull), String)
@test occursin("Max failures reached: 2", result)
@test occursin("First", result)
@test !occursin(r"Test Summary:.*\n.*Second", result)
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These regexes assume \n line endings. To make the tests more robust across platforms/environments, consider matching \r?\n (or otherwise avoiding hard-coding newline style) so the assertion doesn’t become Windows-sensitive.

Copilot uses AI. Check for mistakes.
result = read(pipeline(ignorestatus(cmd), stderr=devnull), String)
@test occursin("Max failures reached: 1", result)
@test occursin("First", result)
@test !occursin(r"Test Summary:.*\n.*Second", result)
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These regexes assume \n line endings. To make the tests more robust across platforms/environments, consider matching \r?\n (or otherwise avoiding hard-coding newline style) so the assertion doesn’t become Windows-sensitive.

Copilot uses AI. Check for mistakes.
Comment thread stdlib/Test/src/Test.jl Outdated
Comment thread stdlib/Test/src/Test.jl Outdated
Comment thread stdlib/Test/src/Test.jl Outdated
Comment thread stdlib/Test/src/Test.jl Outdated
Comment thread stdlib/Test/src/Test.jl Outdated
@oscardssmith
Copy link
Copy Markdown
Member

Overall, I think this is the right direction (although seeing the changes in Pkg would be useful to see this in use).

@DilumAluthge
Copy link
Copy Markdown
Member

Thanks for putting this together!

Four new functions are exported from Test:

Do those functions need to be exported (or public)?

@VanitasCodes
Copy link
Copy Markdown
Author

@oscardssmith I have a few questions before I revise.

You mentioned the default should be 0 for backward compatibility. I originally went with typemax(Int) to mean "unlimited" so existing behavior stays the same, but I think you're suggesting 0 should mean "disabled/not counting" instead of "stop immediately"? That makes more sense. So the semantics would be 0 means the feature is off, and you'd call set_max_failures(10) to enable it with a limit of 10. Is that what you had in mind?

You're completely right about the atomics. I was overthinking the concurrency case, but tests run sequentially in a single process anyway. I'll switch to a simple Ref{Int} or just a module global.

For the API, if we make the limit nonatomic and just use direct access like global_failure_limit = n, do we still want helper functions at all, or should Pkg just set Test.global_failure_limit directly? I can see it going either way. The functions feel a bit over-engineered for what they do.

Should I draft a companion PR for Pkg showing how Pkg.test(; maxfailures=10) would use this, or wait until this settles?

@VanitasCodes
Copy link
Copy Markdown
Author

@DilumAluthge I originally exported them thinking they might be useful for custom test runners, but @oscardssmith pointed out they probably shouldn't be. I think you're right that they should either be public (documented but not exported) or just kept internal and accessed as Test.set_max_failures(). What do you think makes more sense? I'm leaning toward not exporting them and letting Pkg access them via the Test. prefix.

@DilumAluthge
Copy link
Copy Markdown
Member

just kept internal and accessed as Test.set_max_failures()

Keeping them internal (non-public) sounds good to me!

@VanitasCodes
Copy link
Copy Markdown
Author

@oscardssmith I've pushed an updated version based on your feedback. The atomics are gone and the globals are now plain Ref{Int} values that Pkg can set directly via Test.global_failure_limit[] = n and reset via Test.global_failure_count[] = 0. I also removed all the helper functions and the export line since they're no longer needed. Let me know if this is closer to what you had in mind or if you'd like any further changes.

@DilumAluthge Updated the PR to remove the exports entirely as you and @oscardssmith suggested. The globals are now accessed directly via Test.global_failure_limit[] and Test.global_failure_count[], which keeps things simple and internal.

Comment thread stdlib/Test/src/Test.jl Outdated
@VanitasCodes VanitasCodes force-pushed the add-maxfailures-option branch from db67957 to af210e1 Compare April 13, 2026 19:35
@VanitasCodes
Copy link
Copy Markdown
Author

@oscardssmith The count is now a Threads.Atomic{Int} while the limit stays a plain Ref since it's only written once before the run starts.

Comment thread stdlib/Test/src/Test.jl Outdated
@VanitasCodes VanitasCodes force-pushed the add-maxfailures-option branch from af210e1 to 130536b Compare April 13, 2026 20:53
@VanitasCodes
Copy link
Copy Markdown
Author

@oscardssmith Is there anything else you'd like me to address before this is ready?

@oscardssmith
Copy link
Copy Markdown
Member

Seems ready to me. I'd like to see the Pkg side PR before merging to get a real user of this, but the code here looks correct.

@VanitasCodes
Copy link
Copy Markdown
Author

@oscardssmith I'll put together a draft PR for the Pkg side showing the integration. I'll link it here once it's ready.

@DilumAluthge
Copy link
Copy Markdown
Member

@VanitasCodes Just wanted to check- have you used any AI tools when working on this PR? Either to generate the code or to write the PR description or comments?

@VanitasCodes
Copy link
Copy Markdown
Author

VanitasCodes commented Apr 15, 2026

@VanitasCodes Just wanted to check- have you used any AI tools when working on this PR? Either to generate the code or to write the PR description or comments?

I have had some assistance for the code, yes. But it was majorly used for understanding and navigating through the codebase more effectively.

@VanitasCodes
Copy link
Copy Markdown
Author

@oscardssmith quick update before I push. While testing locally I found that when MaxFailuresError is thrown inside include(), it gets wrapped in a LoadError, so the catch blocks were never seeing it as a MaxFailuresError and were printing "Fail-fast enabled" instead of "Max failures reached". Fixed it by unwrapping first with real_err = err isa LoadError ? err.error : err before the check. The correct message now shows up.

Also while trying to test the Pkg integration I realized Pkg isn't actually tracked in the Julia repo so the Pkg.test(; maxfailures=n) side would need to be a separate PR on JuliaLang/Pkg.jl. I have it written and tested locally but wanted to confirm that's the right place before I open anything there. Is that how it's supposed to work?

@oscardssmith
Copy link
Copy Markdown
Member

Is that how it's supposed to work?

Yeah, that's right.

@VanitasCodes VanitasCodes force-pushed the add-maxfailures-option branch from 130536b to 2e98e91 Compare April 16, 2026 01:56
@VanitasCodes
Copy link
Copy Markdown
Author

@oscardssmith just pushed the LoadError unwrapping fix. Can you have a look at it and lemme know if the changes look right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Top-level @testsets stop on errors testset with for loop stops at first failed test

4 participants